Labrador
- North America > Canada > Newfoundland and Labrador > Labrador (0.04)
- North America > United States > California > San Diego County > San Diego (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- (4 more...)
- North America > United States > Texas (0.14)
- North America > Canada > Ontario > National Capital Region > Ottawa (0.13)
- North America > Canada > Ontario > Toronto (0.13)
- (44 more...)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
- Overview (1.00)
- (2 more...)
- Transportation > Passenger (1.00)
- Transportation > Air (1.00)
- Leisure & Entertainment (1.00)
- (25 more...)
- Asia > Middle East > Jordan (0.04)
- South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
- North America > United States > Virginia (0.04)
- (17 more...)
- Law (1.00)
- Information Technology > Security & Privacy (1.00)
- Health & Medicine > Therapeutic Area > Neurology (1.00)
- (8 more...)
Training-Free Dual Hyperbolic Adapters for Better Cross-Modal Reasoning
Zhang, Yi, Cheng, Chun-Wun, He, Junyi, Yu, Ke, Tang, Yushun, Schönlieb, Carola-Bibiane, He, Zhihai, Aviles-Rivero, Angelica I.
Recent research in Vision-Language Models (VLMs) has significantly advanced our capabilities in cross-modal reasoning. However, existing methods suffer from performance degradation under domain changes or require substantial computational resources for fine-tuning in new domains. To address this issue, we develop a new adaptation method for large vision-language models, called Training-free Dual Hyperbolic Adapters (T-DHA). We characterize the vision-language relationship between semantic concepts, which typically has a hierarchical tree structure, in hyperbolic space instead of the traditional Euclidean space. We find that this unique property is particularly effective for embedding hierarchical data structures using the Poincaré ball model, achieving significantly improved representation and discrimination power. Coupled with negative learning, it provides more accurate and robust classification with fewer feature dimensions. Our extensive experimental results on various datasets demonstrate that the T-DHA method significantly outperforms existing state-of-the-art methods in few-shot image recognition and domain generalization tasks.
Large Vision-Language Models (VLMs), such as CLIP [1] and ALIGN [2], are trained on extensive image-text datasets using contrastive learning. These models excel in creating a unified vision-language embedding space by aligning visual and textual modalities, enabling their successful application across a wide range of downstream visual tasks, such as few-shot image recognition [3]-[5].
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
- Europe > Switzerland > Zürich > Zürich (0.14)
- North America > Canada > Newfoundland and Labrador > Labrador (0.05)
- (3 more...)
- Research Report > New Finding (0.68)
- Research Report > Promising Solution (0.66)
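A note on the geometry behind T-DHA: the Poincaré ball model has a closed-form geodesic distance, and its behaviour is what makes hyperbolic space a natural home for tree-structured concept hierarchies. Below is a minimal illustrative sketch, not the paper's code; the example points are invented.

```python
import numpy as np

def poincare_distance(u: np.ndarray, v: np.ndarray) -> float:
    """Geodesic distance in the Poincare ball:
    d(u, v) = arcosh(1 + 2||u - v||^2 / ((1 - ||u||^2)(1 - ||v||^2)))."""
    sq_diff = np.sum((u - v) ** 2)
    denom = (1.0 - np.sum(u ** 2)) * (1.0 - np.sum(v ** 2))
    return float(np.arccosh(1.0 + 2.0 * sq_diff / denom))

# Invented example: a general concept near the origin, two specific
# concepts near the boundary. Distances blow up near the boundary, so
# leaf-to-leaf paths are much longer than root-to-leaf paths -- exactly
# the metric structure of a tree.
root = np.array([0.05, 0.0])
leaf_a = np.array([0.85, 0.0])
leaf_b = np.array([0.0, 0.85])
print(poincare_distance(root, leaf_a))    # ~2.4
print(poincare_distance(leaf_a, leaf_b))  # ~4.3
```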
Bridging the Knowledge-Prediction Gap in LLMs on Multiple-Choice Questions
Park, Yoonah, Pyun, Haesung, Jo, Yohan
Large Language Models (LLMs) often fail on multiple-choice questions (MCQs) despite demonstrating correct knowledge in other contexts, such as free-form generation. To investigate the mechanism underlying this knowledge-prediction gap on MCQs and alleviate it, we conduct a probing analysis and find that residual streams in certain layers contain a subspace spanned by two important bases: a \emph{knowledge basis} that encodes the probability of the ground-truth answer for a given MCQ and a \emph{prediction basis} that encodes the probability of the answer choice predicted by the model. We observe that incorrect predictions arise from a misalignment of the model's hidden states along these two bases. Hence, we introduce \textbf{KAPPA} (Knowledge-Aligned Prediction through Projection-based Adjustment), a parameter-free intervention that transforms the hidden states to align the prediction coordinate with the knowledge coordinate within this subspace. Experiments on binary-choice reformulations of Big-Bench-Hard and ARC-Challenge show that KAPPA substantially improves accuracy and consistently outperforms baselines. While optimal subspaces differ across tasks, subspaces generalize to some extent, as supported by cross-dataset experiments. Moreover, KAPPA extends its effectiveness to free-form questions beyond MCQs. Our work provides a new geometric understanding of the knowledge-prediction gap and offers a practical method for better aligning model behavior with its latent knowledge.
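The core of KAPPA is a parameter-free edit to the hidden state: read off its coordinates along the probed knowledge and prediction bases, then shift it so the two agree. A minimal sketch of that projection step follows; the basis vectors are assumed given (in the paper they come from probing) and, for simplicity, orthogonal.

```python
import numpy as np

def kappa_align(h: np.ndarray, b_know: np.ndarray, b_pred: np.ndarray) -> np.ndarray:
    """Shift hidden state h along the prediction basis so that its
    prediction coordinate matches its knowledge coordinate.
    Assumes b_know and b_pred are (roughly) orthogonal; otherwise the
    shift along b_pred would also perturb the knowledge coordinate."""
    b_know = b_know / np.linalg.norm(b_know)
    b_pred = b_pred / np.linalg.norm(b_pred)
    c_know = h @ b_know  # how strongly the ground-truth answer is encoded
    c_pred = h @ b_pred  # how strongly the model's (possibly wrong) choice is encoded
    return h + (c_know - c_pred) * b_pred
```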
Coefficient of Variation Masking: A Volatility-Aware Strategy for EHR Foundation Models
Fani, Rajna, Attrach, Rafi Al, Restrepo, David, Jia, Yugang, Celi, Leo Anthony, Schüffler, Peter
Masked autoencoders (MAEs) are increasingly applied to electronic health records (EHR) for learning general-purpose representations that support diverse clinical tasks. However, existing approaches typically rely on uniform random masking, implicitly assuming all features are equally predictable. In reality, laboratory tests exhibit substantial heterogeneity in volatility: some biomarkers (e.g., sodium) remain stable, while others (e.g., lactate) fluctuate considerably and are more difficult to model. Clinically, volatile biomarkers often signal acute pathophysiology and require more sophisticated modeling to capture their complex temporal patterns. We propose a volatility-aware pretraining strategy, Coefficient of Variation Masking (CV-Masking), that adaptively adjusts masking probabilities according to the intrinsic variability of each feature. Combined with a value-only masking objective aligned with clinical workflows, CV-Masking yields systematic improvements over random and variance-based strategies. Experiments on a large panel of laboratory tests show that CV-Masking enhances reconstruction, improves downstream predictive performance, and accelerates convergence, producing more robust and clinically meaningful EHR representations.
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.05)
- Europe > Germany > Bavaria > Upper Bavaria > Munich (0.05)
- North America > Canada > Newfoundland and Labrador > Labrador (0.04)
- (2 more...)
- Research Report > Experimental Study (1.00)
- Research Report > New Finding (0.94)
- Health & Medicine > Diagnostic Medicine (0.91)
- Health & Medicine > Health Care Technology > Medical Record (0.70)
- Health & Medicine > Therapeutic Area > Immunology (0.46)
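The masking schedule in CV-Masking is simple to state: compute each lab feature's coefficient of variation and let its masking probability scale with it, so volatile biomarkers (lactate) are masked more often than stable ones (sodium). The sketch below is one plausible instantiation, not the paper's exact weighting; the base rate and clipping bounds are invented.

```python
import numpy as np

def cv_masking_probs(X: np.ndarray, base_rate: float = 0.15, eps: float = 1e-8) -> np.ndarray:
    """Per-feature masking probabilities from the coefficient of variation.
    X: (n_samples, n_features) matrix of lab values."""
    cv = X.std(axis=0) / (np.abs(X.mean(axis=0)) + eps)  # volatility per feature
    probs = base_rate * cv / cv.mean()  # rescale so the average rate stays ~base_rate
    return np.clip(probs, 0.01, 0.5)    # invented bounds: every feature is sometimes masked

rng = np.random.default_rng(0)
# Synthetic panel: a stable "sodium-like" lab and a volatile "lactate-like" lab.
X = rng.normal(loc=[140.0, 2.0], scale=[3.0, 1.5], size=(1000, 2))
probs = cv_masking_probs(X)
mask = rng.random(X.shape) < probs  # True = value masked for reconstruction
print(probs)  # the volatile feature draws the higher masking probability
```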
KeyPointDiffuser: Unsupervised 3D Keypoint Learning via Latent Diffusion Models
Newbury, Rhys, Zhang, Juyan, Tran, Tin, Kurniawati, Hanna, Kulić, Dana
Understanding and representing the structure of 3D objects in an unsupervised manner remains a core challenge in computer vision and graphics. Most existing unsupervised keypoint methods are not designed for unconditional generative settings, restricting their use in modern 3D generative pipelines; our formulation explicitly bridges this gap. We present an unsupervised framework for learning spatially structured 3D keypoints from point cloud data. These keypoints serve as a compact and interpretable representation that conditions an Elucidated Diffusion Model (EDM) to reconstruct the full shape. The learned keypoints exhibit repeatable spatial structure across object instances and support smooth interpolation in keypoint space, indicating that they capture geometric variation. Our method achieves strong performance across diverse object categories, yielding a 6 percentage-point improvement in keypoint consistency compared to prior approaches.
- North America > United States > Illinois > Cook County > Chicago (0.04)
- North America > Canada > Newfoundland and Labrador > Labrador (0.04)
- North America > Canada > Alberta > Census Division No. 15 > Improvement District No. 9 > Banff (0.04)
- (2 more...)
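The conditioning pathway is the interesting part of KeyPointDiffuser's design: a permutation-invariant encoder distils the point cloud into K keypoints, and those keypoints are all the diffusion denoiser sees when reconstructing the shape. A minimal PyTorch sketch of that bottleneck follows; the layer sizes are invented and the EDM denoiser itself is elided.

```python
import torch
import torch.nn as nn

class KeypointBottleneck(nn.Module):
    """Illustrative sketch (invented sizes, not the paper's architecture):
    map a point cloud to K 3-D keypoints via a PointNet-style encoder.
    The keypoints would then condition an EDM denoiser reconstructing the
    full shape, forcing them to carry the object's structure."""

    def __init__(self, k: int = 16):
        super().__init__()
        self.k = k
        self.per_point = nn.Sequential(nn.Linear(3, 128), nn.ReLU(), nn.Linear(128, 256))
        self.head = nn.Linear(256, 3 * k)

    def forward(self, points: torch.Tensor) -> torch.Tensor:
        feats = self.per_point(points)    # (B, N, 256) per-point features
        pooled = feats.max(dim=1).values  # (B, 256) permutation-invariant pooling
        return self.head(pooled).view(-1, self.k, 3)  # (B, K, 3) keypoints

kp = KeypointBottleneck()(torch.randn(2, 1024, 3))
print(kp.shape)  # torch.Size([2, 16, 3])
```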
From Tokens to Thoughts: How LLMs and Humans Trade Compression for Meaning
Shani, Chen, Soffer, Liron, Jurafsky, Dan, LeCun, Yann, Shwartz-Ziv, Ravid
Humans organize knowledge into compact conceptual categories that balance compression with semantic richness. Large Language Models (LLMs) exhibit impressive linguistic abilities, but whether they navigate this same compression-meaning trade-off remains unclear. We apply an Information Bottleneck framework to compare human conceptual structure with embeddings from 40+ LLMs using classic categorization benchmarks. We find that LLMs broadly align with human category boundaries, yet fall short on fine-grained semantic distinctions. Unlike humans, who maintain ``inefficient'' representations that preserve contextual nuance, LLMs aggressively compress, achieving more optimal information-theoretic compression at the cost of semantic richness. Surprisingly, encoder models outperform much larger decoder models in human alignment, suggesting that understanding and generation rely on distinct representational mechanisms. Training-dynamics analysis reveals a two-phase trajectory: rapid initial concept formation followed by architectural reorganization, during which semantic processing migrates from deep to mid-network layers as the model discovers increasingly efficient, sparser encodings. These divergent strategies, where LLMs optimize for compression and humans for adaptive utility, reveal fundamental differences between artificial and natural intelligence. This highlights the need for models that preserve the conceptual ``inefficiencies'' essential for human-like understanding.
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- North America > United States > California > Santa Clara County > Palo Alto (0.04)
- North America > Canada > Newfoundland and Labrador > Labrador (0.04)
- (5 more...)
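The Information Bottleneck lens used here reduces to two mutual-information terms: I(X;T), the complexity of the conceptual code, and I(T;Y), how much meaning the code retains. The toy sketch below computes both for a hard clustering; the items, clusters, and labels are invented, and the paper's actual objective and estimators are not reproduced.

```python
import numpy as np
from collections import Counter

def mutual_information(xs, ys):
    """I(X;Y) in bits for paired discrete sequences (plug-in estimator)."""
    n = len(xs)
    pxy, px, py = Counter(zip(xs, ys)), Counter(xs), Counter(ys)
    return sum((c / n) * np.log2(c * n / (px[x] * py[y]))
               for (x, y), c in pxy.items())

items    = [0, 1, 2, 3, 4, 5]
clusters = [0, 0, 0, 1, 1, 1]  # an aggressive 2-cluster "LLM-style" compression
labels   = [0, 0, 1, 1, 2, 2]  # finer-grained "human" categories

print(mutual_information(items, clusters))   # I(X;T): cost of the code, 1.0 bit
print(mutual_information(clusters, labels))  # I(T;Y): meaning kept, ~0.67 bits
```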
CURENet: Combining Unified Representations for Efficient Chronic Disease Prediction
Dao, Cong-Tinh, Phan, Nguyen Minh Thao, Ding, Jun-En, Wu, Chenwei, Restrepo, David, Luo, Dongsheng, Zhao, Fanyi, Liao, Chun-Chieh, Peng, Wen-Chih, Wang, Chi-Te, Chen, Pei-Fu, Chen, Ling, Ju, Xinglong, Liu, Feng, Hung, Fang-Ming
Electronic health records (EHRs) are designed to synthesize diverse data types, including unstructured clinical notes, structured lab tests, and time-series visit data. Physicians draw on these multimodal and temporal sources of EHR data to form a comprehensive view of a patient's health, which is crucial for informed therapeutic decision-making. Yet, most predictive models fail to fully capture the interactions, redundancies, and temporal patterns across multiple data modalities, often focusing on a single data type or overlooking these complexities. In this paper, we present CURENet, a multimodal model (Combining Unified Representations for Efficient chronic disease prediction) that integrates unstructured clinical notes, lab tests, and patients' time-series data by utilizing large language models (LLMs) for clinical text processing and textual lab tests, as well as transformer encoders for longitudinal sequential visits. CURENet captures the intricate interactions between different forms of clinical data, yielding a more reliable predictive model for chronic illnesses. We evaluated CURENet using the public MIMIC-III and private FEMH datasets, where it achieved over 94\% accuracy in predicting the top 10 chronic conditions in a multi-label framework. Our findings highlight the potential of multimodal EHR integration to enhance clinical decision-making and improve patient outcomes.
- North America > United States > Utah (0.14)
- Asia > Taiwan (0.04)
- North America > United States > Michigan (0.04)
- (4 more...)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (0.88)
- Information Technology > Data Science > Data Mining (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
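Architecturally, the CURENet recipe is: embed clinical notes and textualised lab tests with an LLM, encode the visit sequence with a transformer, concatenate, and classify with a sigmoid head for the multi-label setting. The PyTorch sketch below shows that fusion step; all dimensions and the mean-pooling choice are assumptions, not the paper's specification.

```python
import torch
import torch.nn as nn

class FusionHead(nn.Module):
    """Illustrative sketch of multimodal EHR fusion (invented dimensions):
    frozen-LLM embeddings for clinical notes and textualised lab tests,
    a transformer encoder over the visit sequence, then a multi-label head."""

    def __init__(self, d_text: int = 768, d_visit: int = 64, n_diseases: int = 10):
        super().__init__()
        layer = nn.TransformerEncoderLayer(d_model=d_visit, nhead=4, batch_first=True)
        self.visit_encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.classifier = nn.Linear(2 * d_text + d_visit, n_diseases)

    def forward(self, note_emb, lab_emb, visits):
        # note_emb, lab_emb: (B, d_text); visits: (B, T, d_visit)
        visit_repr = self.visit_encoder(visits).mean(dim=1)         # (B, d_visit)
        fused = torch.cat([note_emb, lab_emb, visit_repr], dim=-1)  # (B, 2*d_text + d_visit)
        return torch.sigmoid(self.classifier(fused))                # per-disease probabilities

head = FusionHead()
probs = head(torch.randn(4, 768), torch.randn(4, 768), torch.randn(4, 12, 64))
print(probs.shape)  # torch.Size([4, 10])
```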